SEQUENCE LISTING

(1) GENERAL INFORMATION:



(i) APPLICANT: 

(A) NAME: Fred Hutchinson Cancer Research Center, Inc. 

(B) STREET: 1100 Fairview Avenue North, Mailstop C2M-027

(C) CITY: Seattle 

(D) STATE: Washington 

(E) COUNTRY: USA 

(F) POSTAL CODE (ZIP): 98109



(A) NAME: Thomas Spies 

(B) STREET: 2429 E. Aloha 

(C) CITY: Seattle 

(D) STATE: Washington 

(E) COUNTRY: USA 

(F) POSTAL CODE (ZIP): 98112



(A) NAME: Veronika Spies 

(B) STREET: 2429 E. Aloha 

(C) CITY: Seattle 

(D) STATE: Washington 
(E) COUNTRY: USA

(F) POSTAL CODE (ZIP): 98112

(ii) TITLE OF INVENTION: CELL STRESS REGULATED HUMAN MHC CLASS I GENE

(iii) NUMBER OF SEQUENCES: 16

(iv) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Floppy disk

(B) COMPUTER: IBM PC compatible

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30 (EPO)

(vi) PRIOR APPLICATION DATA:

(A) APPLICATION NUMBER: US 60/029,044

(B) FILING DATE: 29-OCT-1996



(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 11722 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 



CACTGCTTGA GCCGCTGAGA GGGTGGCGAC GTCGGGGCCA TGGGGCTGGG CCCGGTCTTC 60

CTGCTTCTGG CTGGCATCTT CCCTTTTGCA CCTCCGGGAG CTGCTGCTGG TGAGTGGCGT 120

TCCTGGCGGT CCTCGGCGGA GCGGGAGCAG TGGGACGTTT CCGGGGGTCG GGTGGGTAGC 180

GGCGAGCGCT GTGCGGTCAG GGCGGGGCTC CTGTGCCCTG TCGGTGGCGC AGGGAGCTGG 240

ACGCGGCCCG TTACCGCCAC ACTTCAGCCC TGCTTCCCCG TCACTTTTCA GTCCTCCTCG 300

GGATCGCGCA TCACCTGCAC TTTCTGGTCT CCTCCTGCTC TTTCTCTCCT CGCGTCTCCT 360

CCGCTTCCTC TCACTTTTCG GACAAACCAG TCCTTCTGAG GCCCATGGGT TCCCGGGCTG 420

CCTCCGGGGC TGCTCCTGTG AATGGCATTC GAGTGCCCTT CCAGCGCGGC CACTGAAGCA 480

GCCACAACCC CCGGTGCTCG GGGCGGCTCT CAGGTCCCTG AAGTCCTGTC CTCTCCCGGA 540

GCCGACGTGT TCTCAGCTCC TGGGCCGCAG CTCCTGGAGT AGGGGCCCTC CTTTCTCGGG 600

ACCCGGAGCT GGTGCTTCCT GCTGCTGTGG GGACTGTGGG GGGTCCTGAC TCTCAAGCTG 660

AGGGGTTGGA GTCTGCAGGC TCCGGGCAGA GGATTCTTCC TGCGACTTCT CTCATCCCCA 720

GCTCATTCTC CCCTCGCCTC TGGCTCCGAG GGTCCTCTCC TCTCTCTCAT CCCACCCCTA 780

CTAATGACCA GTGATCTAAG GACACCAGAT TCCCTCTCAC CTCCTCCCTG CCCATCTCAG 840

GGCCCGCTGA GTCCTTTTGC CCTCCCAGCT CCCTGCTACC CCTTCCTGTG TGCTGTTCTC 900

TGATCCATTT CTAGGGTGTC CTCTGCCCTC ATCCCCTGTC CCCGCCACCG AAGTCCCTCC 960

TGCACCCCTT ATGGGCCTTT CCTACAAGCA GCCTTCACCC AGTGCTGCCC CTATGCCTCC 1020

CCGTTCCCAA ATGTCCCTGA CTCTAACTTT CTGGTGCTGC CTTTTATCCG GGGGGGTCTT 1080

CCCTCCATCC CACTCCCCTC CAGACCCCCA AGGGGAACCC TGATGCTAAT GGCAGTTGGG 1140

CCTTAGGCAG GGCGCAGGGC AGCGCAGATG CCCCCTCCCC TCCAGTGCAG ATGCCTGTTC 1200

TGGACCCTGC CTCATTGTGG CCCCTTCCCC ACTCCTTCAT CCTCAGCCTC ACCCTCTTGA 1260

GGACCCCACC CTCCAGCCCA CAGGTGCTGG ACCATCCCTC CCTGGTCCCT CCGCCCCTCT 1320

CCACCTTGGG ACCTTGTGCT GCTCCTATCT CTTGCCCAGC TGCCTTGGGC CCTCAGCACG 1380

TTCTCATCTT TCAGTGGGAA AGTGGGAGTG CTGGAGCATA TGACAGTGCT GAGCATCTTT 1440

CCCAAGCCCC ACCCTCCCCC AGAGCACCCT CCCCTCCTGT CCTCACCCTA CCCCAAGTTC 1500

TCCCACAGTC ACTCCTGCCC CATGCTCATG CCGCCCTCCA GTTCTTGCTC TGCCCATCTC 1560

CCCTCCCCAA CCCAGACCTA AAACAGGCTG TTGGGCCAAC TGTTCCTTGA CCTTCCTTCT 1620

TTTCTTTTGG TTCCTTGACC CCAGTGGGCT CTCACTCCCC ACACCGCATA TCTAAAATCT 1680

GTTTTGCCTG CTCTTGGGGT GCCACTGCTC CCCCTCCAGC ATTACTCCTT TTGGCAGGTC 1740

CTTCCTCAGG CTGAGAATCT CCCCCTCTAC CTTGGTTTTC TCTCTCTGGC CAGCACCCCC 1800

ACTCCTTGCT TTGTTTTTAA TTTTTAACTT TTGTTTGGGT ACGTAGTAGA TATATATGTA 1860

TATATTTATG GGGTACATGG GATATTTTGA CACAGGCCTA CAATATGTAA TAATCACATC 1920

AGGGTAAATG GGTTATATCA CAACAAGCAT TTATCCTTTC TTTGTGCTAC AAACAATCCC 1980



ATTATGCTCT TTCAGTTATT TTTAAATGTA CAATAAATTA TTGTTGACTG TACTCACCCT 2040

GCTGTGCTAT CTACTAGATC TTATTCATTC TAATTATATT TTTGTACCCA TTATTAACCA 2100

TCCCTGCTCC CCCACTCCCC ACTACCCTTC TCAGCCTCTG GTAATCATCA TTCTATTGTC 2160

TCTCCCCATG AGGTCCATTG TTTTAAATTT TGGCTGCCAC AAATAAGTGA GAACATGCAA 2220

AGTTTGTCTG TCTGGGCCTG GGGCTTATTT CACTTCACAG GATGACCTCC AGTTCTTTGC 2280

AAATGACACG ATGGCTGAAT AGTTCTCCAC ATACACATGT ACACCACATT TTCTTTATCC 2340

ATGCGTCTGT TGATGGACAC TTAGATTGCT TGCAGATCTT GGCTACTTTG AATAGTGCTG 2400

CAATAAACAT GGAAAAGTAG ATAGCTCTTT AATATACCGA TTTCCTTTCT TTGGAGTATA 2460

TGCCTAACAG TGGGAGTGCT GGAGCATATG ACAGCTCTAT TGTATTTTTA GTTTTTGGAA 2520

GAACCTCCAC ATTGTTTCCC ATAGTGGTTG TACTAGTTTA CGTTCCCACC AACAGTGTAC 2580

ATCCTCACCA GCATTCCTTA TTTCTACATC CTCGCCAGCA TTCCTTATTG CCTGTCTTCT 2640

GGATAAAAGC CAGTTTATCT GGGGTGGGAT GTTATCTCGT AGGAGTTTTG ATTTGCCTTC 2700

ATCTGTTGAC GAATGATGTT GAGCACCTTT TCATATACCT GTTTGCCATT TATATGTCTT 2760

CTTTTGAGAA ATGACTATTC AGATCTTTTC TCATTTTTAA ATTGGATTAT TATATTTTTT 2820

TTCCTATAGT TGTTCGAGCT CCTTATATGT TTCAGTTACT GATCCTTTGT CAGATGAATA 2880

GTTTGAAAAT ATTTTCTCCC ATTCTTGGAT GGTCTCTTCA TTTTGTTTAT TGTTTCCTTT 2940

GCTGTGCAGA AGCCTTTTTA CTTGATATGA TCCCATTTAT GCAATTTTAC TTTGGTTACC 3000

TGTGCTTGTG GGGTATTACT TTAAAAATCT TTGCCCAGTC CAATATCCTA GAGAGTTTCC 3060

CCAATGTTTT CTTGTATAGT TTCATAGTTT GAGGTCATAG ATTTACATCT TTAATCCACT 3120

TTGATTTGAT TTTTGTATAT GGTGAAAGAC AGGGTCTAGT TTCATTCTTC TGCATAAGGA 3180

TATCTAGTTT CCCCAGCACC ATTTTTGAAG AGACTCTCCT TTGCCAATGT GTGTTCTTGG 3240

TACCTTTGTT GGAAATGAGT TTACTGTAGA TGTATGGAAT TGTTTCTGGG TTCTCTATTC 3300

TGTTTCATTG GTCTGTGTGT CTGTTTTTAT GCCAGTATCA TGCTGTTTTG GTTACTGTAG 3360

CTCTGTAGTA TAATTTGAAG TCAGATAATG TGATTCCTCT AGTTTTGTTC ATTTTGCTCA 3420

GGATAGCTTT ATCTATTCTG GTTTTTTTGT GGTTCCATAT GCATTTTAGG ATTATTTTTA 3480

TTATTTCTGT GAAGAATGTC ATTAGTGTTT TGATAGGGAT TGCATTGAAT CTGTAGATTA 3540

CTTTGGGTAG TATGGATATT TCAACAAAAC TGATTCTTCC AATCCATGAA CGTGGACTAT 3600

CTTTTCCATT TTTTGTGTCC TTCAATTTTT TGCATCAGTG TTTTTTGTTT TTGGTTTTTG 3660

AGATGGAGTT TCACTCTTGT TGCCCAGGCT AGAATGCAAG GGTGTGATCT TGGCTCACCG 3720
CCAAGTTGTA GTATGGGTCA GAATTTCATT CCTTTTAAGG ATGGATAATA CTCATTATAT 5520

GTATGTACCA CATCTTGGTT ATCCATCCCT CAGACAATGG ACACTTGGGT TACTTCTACC 5580

TTTTGGATAT TGGCAAATAT TTCATTTCCT TTGGGTATAT ATTTATTTCC TTTGGGTATT 5640

TCTTTTGGGT ATATATCCAG AAATAGAAGC AGTACACAGG GGCTTCATTT TCTCTGTCTC 5700

TTTGCCAACC TTGCTCTGTG TGTGTGTGTA TGTGTGTGTG TAGGTGTGTG ATAACAGCCA 5760

TCCTGATTGG TTTCAGGTGG CATCTCATTG TGGTTTGGAT TTGCATTTTC CTAATGAGTG 5820

CTGATATTGA GCATCTTTTC ATGTGTTTGT TGATCATTTG TAATTTTCTT TGAAGAATTG 5880

GCCATTTAAG TCTTTTGCCC ATTTTTTCCC CCACATAGCT TCTCTTATCA GATATATGAC 5940

TTGCAATATT TATTTCATTT CGGGGTTGAT TGCTTTTTCA CTCTGATTGT GCCCTTTGAT 6000

GCATAGATGT TTTGAATTTT CATCAGTCTA CTTTGTCAGT TCTTTCTATT CTATCTGTGC 6060

TTTGGTGTCA TATCCATGAA AGCACTGTCA AATCCTATGT CATGAACATT ATCCCCAATG 6120

TTTGCTTCTA AGAAATTTTT AGGTTTTAGT TCTTGAGTGT AGAGTTTAGG TCTTTGATTC 6180

ATTTTGAGTT AATTTTTGTA TATAGTGCAA ATTAAGGGTC CAATTTTATT TTAACACCCC 6240

CTGCCCCCAG AACTATTTGC TGAAAAGATC AACTGACTCT TTGTCACCTG CTCACCCCAG 6300

TGGACACTAG CTGTTCCATC CAATTGCTGT CCTGGGGCCT TGTCATGCTA CTCTTCCACT 6360

TTGAACCCAA GCCCACACCG TTCGTTGCTC CCCTCTGGGA TACTGACCCC ACTATAAACT 6420

TCTCTGGGGC TACAACCTTC CTACCCTTTG TGCCTCATGA CCACCCCCTC CCTTGTCCCC 6480

GCCATGCCCA TGATGAGTCT CTTCTCGAGG CAGCTCCCCT TGCCTCCATC TCACCCTCAG 6540

CCTATGCACC ACAGCCACAC TGGACATGGG TCCCTCTGAG CCTGAGTCCC TTCCCATTCC 6600

CACCATCTCC TCTGGCAAGA CCTTCCTTCC ACCACCTTCA TGCTCCTCCC TTGCCCCTGC 6660

AGGGCAGCCT CTCCCCTTGG CCCCTATTCC CTTAGGGGGC TTGTGGCCAC CCAGTCCTTG 6720

CACCTGGCCT ACAAGTTTGC CATCTTCATT CCCCCTTCTT CTGTTCATCA GCCCCCTCCT 6780

CTATCCTCCC ACCCTCACAG TTTTCTTTGT ATATGAAATC CTCGTTCTTG TCCCTTTGCC 6840

CGTGTGCATT TCCTGCCCCA GGAAGGTTGG GACAGCAGAC CTGTGTGTTA AACATCAATG 6900

TGAAGTTACT TCCAGGAAGA AGTTTCACCT GTGATTTCCT CTTCCCCAGA GCCCCACAGT 6960

CTTCGTTATA ACCTCACGGT GCTGTCCTGG GATGGATCTG TGCAGTCAGG GTTTCTTGCT 7020

GAGGTACATC TGGATGGTCA GCCCTTCCTG CGCTATGACA GGCAGAAATG CAGGGCAAAG 7080

CCCCAGGGAC AGTGGGCAGA AGATGTCCTG GGAAATAAGA CATGGGACAG AGAGACCAGG 7140

GACTTGACAG GGAACGGAAA GGACCTCAGG ATGACCCTGG CTCATATCAA GGACCAGAAA 7200
GAAGGTGAGA GTCGGCAGGG GCAAGAGTGA CTGGAGAGGC CTTTTCCAGA AAAGTTAGGG

GCAGAGAGCA GGGACCTGTC TCTTCCCACT GGATCTGGCT CAGGCTGGGG GTGAGGAATG

GGGGTCAGTG GAACTCAGCA GGGAGGTGAG CCGGCACTCA GCCCACACAG GGAGGCATGG

GGGAGGGCCA GGGAGGCGTA CCCCCTGGGC TGAGTTCCTC ACTTGGGTGG AAAGGTGATG

GGTTCGGGAA TGGAGAAGTC ACTGCTGGGT GGGGGCAGGC TTGCATTCCC TCCAGGAGAT

TAGGGTCTGT GAGATCCATG AAGACAACAG CACCAGGAGC TCCCAGCATT TCTACTACGA

TGGGGAGCTC TTCCTCTCCC AAAACGTGGA GACTGAGGAA TGGACAGTGC CCCAGTCCTC

CAGAGCTCAG ACCTTGGCCA TGAACGTCAG GAATTTCTTG AAGGAAGATG CCATGAAGAC

CAAGACACAC TATCACGCTA TGCATGCAGA CTGCCTGCAG GAACTACGGC GATATCTAGA

ATCCAGCGTA GTCCTGAGGA GAAGAGGTAC GGACGCTGGC CAGGGGCTCT CCTCTCCCTC

CAATTCTGCT AGAGTTGCCT CACCTCCAAG ATGTGTCCAG GGAAACCCTC CCTGTGCTAT

GGATGAAGGC ATTTCCTGTT GGCACATCGT GTCCTGATTT TCCTCTATTG TTAGAGCCAC

TGGATAAAGA CAGTGGGTCA GGGACTGGAC CATCCAGTGT TGTAATCAGG GCAAGTAGAG

GACCCTCCGA CAGAATCCTG AGCCTGTGGT GGGTGTCAGG CAGGAGAGGA AGCCTTCAGG

GCCAGGGCTG CCCCCTCTGC CTCCCAGCCT GCCCATCCTG GAGAGTTCCC TCCTGGCCCC

ACAACCCAGG AGTCCACCCC TGACATCCCC CTCCTCAGCA TCAATGTGGG GATCCCAGAG

CCTGAGGCCA CAGTCCCAAG GCCCATCCTC CTGCCAGCCT GGAAGAACTG GGCCCCAGAG

TGAGGACAGA CTTGCAGGTC AGGGGTCCCG GAGGGCTTCA GCCAGAGTGA GAACAGTGAA

GAGAAACAGC CCTGTTCCTC TCCCCTCCTT AGAGGGGAGC AGGGCTTCAC TGGCTCTGCC

CTTTCTTCTC CAGTGCCCCC CATGGTGAAT GTCACCCGCA GCGAGGCCTC AGAGGGCAAC

ATCACCGTGA CATGCAGGGC TTCCAGCTTC TATCCCCGGA ATATCACACT GACCTGGCGT

CAGGATGGGG TATCTTTGAG CCACGACACC CAGCAGTGGG GGGATGTCCT GCCTGATGGG

AATGGAACCT ACCAGACCTG GGTGGCCACC AGGATTTGCC AAGGAGAGGA GCAGAGGTTC

ACCTGCTACA TGGAACACAG CGGGAATCAC AGCACTCACC CTGTGCCCTC TGGTGAGCCT

AGGGTGACCC TGGAGAGGGT CAGGCCAGGG TAGGGACAGC AGGGATGGCT GTGGCTCTCT

GCCCAGTGTA TAACAAGTCC CTTTTTTTCA GGGAAAGTGC TGGTGCTTCA GAGTCATTGG

CAGACATTCC ATGTTTCTGC TGTTGCTGCT GCTGCTGCTG CTATTTTTGT TATTATTATT

TTCTATGTCC GTTGTTGTAA GAAGAAAACA TCAGCTGCAG AGGGTCCAGG TGAGAAAAGC

GGGCAGTTTC TGGAGATGGT AAGGCCCCTG TCTGGGCAGT AGGGTCCCCT CATTGCTCCT
GCAAAGATAG GCATGTTGGT GACAAGGCTT CTGTAACAGG GGATGAAAGT TGGGGAATTT 9000

GGGAAGGGAA TGGGGGCAGC ATCTCCATCT ACACCCATAA GTGCTGCCCA AGCGAGGGTC 9060

AAACGCCCAG CTGTGGCATC TTCCTGCTGC AGGTGAGGAG TGGGCAGCAG GGAGGGCTGC 9120

GGCGCCTGCT CTGTCCCCAT CCCGGTCTCT GTGTCTCTTG GACTCACTAG GGCGCATCCA 9180

GGTGGGGTGA GCTGGGAATC ACGTGCTGAA TGCTGAGGGC CTGGATGATC ACGGCCTCAG 9240

AGGGAGCAAA TAGTAAAGGC AGCTGTGATC TGGGGAGGGC CAGAAACTGG AGAGGAATCT 9300

GAGGAGAGGC GGTGCCCCTA TTCCCTTCCT CTCTGCATCC CCCTCCCCTG TTTCTCCAGC 9360

CATCGGGGCG GACACCGAGA AAAAGACCTA TGAGGCCCAG CCTGGGGGCC CTGCCTGTGT 9420

AGCCCTTTGG AGACCCCTAG TAACAGGGAG GGTCCTGAGC ACACATGGCC ATCTCTGTCC 9480

ACTGTGCAGC TCCCCATGCA CCTCCTCCAG GAGCTTTCTT GGGGTTGTCG TGTCCTCTGC 9540

ACCATTCGAG GCCCTACTCT TTCCAGGTTC CCACGGCCTG GCCTCCCTGA GTTTCTTGCA 9600

GATGACATGG ATGAGTAGAT AAGCAGATGT CCCTGGGCCA TTTGAGGAGT GGGGCCCAGC 9660

CCCTCATCAG GGCAGCTGTG GTCCCTGTTT TCATCCTACC TCCGAGTGTT TTCTTCTCCA 9720

GTCCCTGAGG GACACAGTCC TCAGGGCCCA TGTTTTTGGG GATTTAATCT GTGCTCTGTG 9780

GCCTCACCTT GCCTTCCCTG AGCCAATTTC CCTTTCTAAA GGTGGTCACT GCCTGGTAAG 9840

TTTGGAGTAA GGGACGGTCA GAATCATTTC CCCTACAGTC AGGTTGTTTG ATGGGGGATG 9900

AAAAGAGACA GCAGGAAGTT TTGTGTTTCT GCAAAGACAG AAGCAGTTCA GGCGACAGTA 9960

AGAGGCTGGG GTGTCCAGGA GGGTGTGTCT GGCAGTAGGG TCGCTGGTTT CTCATCCTTG 10020

AACCTAATTG CACTGTCAGT CGGCCCCTCA GGCCTGAGCA GATGGGAAGG TTTGTCCCCT 10080

GCCCTGCAGC AAGAGGGCCC TGTCCAGGAG GCACCCACAA CAGAGGCAGT GCAGGTCTGT 10140

GGTCACTCCT ACTCTCACCT GTGGCGTCTC CCGTAGAGGG ATTGTCAGTT CTGGTTCCCT 10200

GTGGGCAGGA ATGGTTTCCT CATAGGTCAC TGGAGTTTTG GCCAGGAAAA GAGTATGAAG 10260

TTCATGTGGC AGTTTCTCAA AATTCCTGCT TTCAATGTTG ATGTCCAGTA AAGATATTCG 10320

TAATTTCAGC TCTATAATCT TAATAGGATT TCCTCTAATA TTGTGAAGCA TATTATATGA 10380

AACAGGAACA CAAATTTCTC AAAATTCCTG CGATGTCCAA TAAAGATTTT CATAATTTCA 10440

GCTCTGCAAT CTTAATAGGA TTTCCTAATA CTGTAAAGCA TATTAAATGA AACAGGAACT 10500

CAAATTTGGA GCCCCCTCTC CAGGAGGTTC TGTGTGGAGA TGGTGGCTGT GGCAGTGGCA 10560

GTTCCCAGGT GCAGAGGGTG GGCAGAGGCA GCCTCAGGCT AAGGGGTCTC CCCTACTCCA 10620

CATGGAGAAA ATCCC'LTGTA GGTTGCAAGG GCAGTGGCCG GGTGGAATCC CTGCTAGGGA 10680

CAGAGCAGGA AGGCCTCGCA GCCTCACCAA GCAGCAGCCC TGGGGTGGAG CTGCGTTTCC 
AGGGTTAAGC GGACCAGGCA GGAGTAGCGG TTACTCAAGA GCAGGTCACA GGCTTGGGTT 
GTGAGGGTCA GGAGAGGCCA GGCCTCCTCG AGCAAGGTGG GGGTCCCAGG GTCAGGTCAG 
GTGCAGATCC TGTGGCAGCC ACGTCTTTCC ATGCTGGGCC TGCTGGGCCC CCCAGGCTTC 
CTGATGGGGT CCCCAGTTAG GAGCTGCCTG CTCAGGGCTG GGAGGGGAGG AGCACTGAGC 
TGCAGATAGA GGGCAGAGCC CACAGTGGGC AGGGCCTGCC CTGGTGTGTA GGTGCCTCTG 
CAGGAGAGGA GGGCCTGGGG ACTGAGAGCA AGGGTCAGGG CCTCTCTTTG GGGAGGCCTC
TCACTGTAAC AGGACTGGTC AGGCCTGAGA GGAGGGCACT GGGTTCCCTC TTGGGTCTTG 
TCCTTTAGTC TTGGGGCCCT TTCCCTCCCT GCACGATGAG TGGTGGGCAC AGGGCACGGG 
CTGATGTTGA TGGAGTGATG GGAGGGAACT GGCAGGGGCT GGGAAAAGCA AGGAGGGAGG 
AAGAAAAAAG TGGGGGCCTC ATCTTCCCTC AGAGAAAGGG CAAATCTGGT TTTGGAGCAA 
CTGAAGAGAG AAAAGTCCCC AGGGAATAAA CACAACACTG CACCCAGTGG AGCATTTACC 
CATTTCCCTC TTTTCTCCAG AGCTCGTGAG CCTGCAGGTC CTGGATCAAC ACCCAGTTGG 
GACGAGTGAC CACAGGGATG CCACACAGCT CGGATTTCAG CCTCTGATGT CAGCTCTTGG 
GTCCACTGGC TCCACTGAGG GCGCCTAGAC TCTACAGCCA GGCGGCTGGA ATTGAATTCC 
CTGCCTGGAT CTCACAAGCA CTTTCCCTCT TGGTGCCTCA GTTTCCTGAC CTATGAAACA 
GAGAAAATAA AAGCACTTAT TTATTGTTGT TGGAGGCTGC AAAATGTTAG TAGATATGAG 
GCATTTGCAG CTGTGCCATA TT 

(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 385 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

Met Gly Leu Gly Pro Val Phe Leu Leu Leu Ala Gly Ile Phe Pro
1 5 10 15

Ala Pro Pro Gly Ala Ala Ala Glu Pro His Ser Leu Arg Tyr Asn
20 25 30

Thr Val Leu Ser Trp Asp Gly Ser Val Gln Ser Gly Phe Leu Ala G
35 40 45

Val His Leu Asp Gly Gln Pro Phe Leu Arg Tyr Asp Arg Gln Lys Cys
50 55 60

Arg Ala Lys Pro Gln Gly Gln Trp Ala Glu Asp Val Leu Gly Asn Lys
65 70 75 80

Thr Trp Asp Arg Glu Thr Arg Asp Leu Thr Gly Asn Gly Lys Asp Leu
85 90 95

Arg Met Thr Leu Ala His Ile Lys Asp Gln Lys Glu Gly Leu His Ser
100 105 110

Leu Gln Glu Ile Arg Val Cys Glu Ile His Glu Asp Asn Ser Thr Arg
115 120 125

Ser Ser Gln His Phe Tyr Tyr Asp Gly Glu Leu Phe Leu Ser Gln Asn
130 135 140

Val Glu Thr Glu Glu Trp Thr Val Pro Gln Ser Ser Arg Ala Gln Thr
145 150 155 160

Leu Ala Met Asn Val Arg Asn Phe Leu Lys Glu Asp Ala Met Lys Thr
165 170 175

Lys Thr His Tyr His Ala Met His Ala Asp Cys Leu Gln Glu Leu Arg
180 185 190

Arg Tyr Leu Glu Ser Ser Val Val Leu Arg Arg Arg Val Pro Pro Met
195 200 205

Val Asn Val Thr Arg Ser Glu Ala Ser Glu Gly Asn Ile Thr Val Thr
210 215 220

Cys Arg Ala Ser Ser Phe Tyr Pro Arg Asn Ile Thr Leu Thr Trp Arg
225 230 235 240

Gln Asp Gly Val Ser Leu Ser His Asp Thr Gln Gln Trp Gly Asp Val
245 250 255

Leu Pro Asp Gly Asn Gly Thr Tyr Gln Thr Trp Val Ala Thr Arg Ile
260 265 270

Cys Gln Gly Glu Glu Gln Arg Phe Thr Cys Tyr Met Glu His Ser Gly
275 280 285

Asn His Ser Thr His Pro Val Pro Ser Gly Lys Val Leu Val Leu Gln
290 295 300

Ser His Trp Gln Thr Phe His Val Ser Ala Val Ala Ala Ala Ala Ala
305 310 315 320

Ala Ile Phe Val Ile Ile Ile Phe Tyr Val Arg Cys Cys Lys Lys Lys
325 330 335

Thr Ser Ala Ala Glu Gly Pro Glu Leu Val Ser Leu Gln Val Leu Asp
340 345 350

Gln His Pro Val Gly Thr Ser Asp His Arg Asp Ala Thr Gln Leu Gly
355 360 365

Phe Gln Pro Leu Met Ser Ala Leu Gly Ser Thr Gly Ser Thr Glu Gly
370 375 380 

Ala 
385 

(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2380 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

GGGCCATGGG GCTGGGCCGG GTCCTGCTGT TTCTGGCCGT CGCCTTCCCT TTTGCACCCC 60

CGGCAGCCGC CGCTGAGCCC CACAGTCTTC GTTACAACCT CATGGTGCTG TCCCAGGATG 120

AATCTGTGCA GTCAGGGTTT CTCGCTGAGG GACATCTGGA TGGTCAGCCC TTCCTGCGCT 180

ATGACAGGCA GAAACGCAGG GCAAAGCCCC AGGGACAGTG GGCAGAAGAT GTCCTGGGAG 240

CTAAGACCTG GGACACAGAG ACCGAGGACT TGACAGAGAA TGGGCAAGAC CTCAGGAGGA 300

CCCTGACTCA TATCAAGGAC CAGAAAGGAG GCTTGCATTC CCTCCAGGAG ATTAGGGTCT 360

GTGAGATCCA TGAAGACAGC AGCACCAGGG GCTCCCGGCA TTTCTACTAC GATGGGGAGC 420

TCTTCCTCTC CCAAAACCTG GAGACTCAAG AATCGACAGT GCCCCAGTCC TCCAGAGCTC 480

AGACCTTGGC TATGAACGTC ACAAATTTCT GGAAGGAAGA TGCCATGAAG ACCAAGACAC 540

ACTATCGCGC TATGCAGGCA GACTGCCTGC AGAAACTACA GCGATATCTG AAATCCGGGG 600

TGGCCATCAG GAGAACAGTG CCCCCCATGG TGAATGTCAC CTGCAGCGAG GTCTCAGAGG 660

GCAACATCAC CGTGACATGC AGGGCTTCCA GCTTCTATCC CCGGAATATC ACACTGACCT 720

GGCGTCAGGA TGGGGTATCT TTGAGCCACA ACACCCAGCA GTGGGGGGAT GTCCTGCCTG 780

ATGGGAATGG AACCTACCAG ACCTGGGTGG CCACCAGGAT TCGCCAAGGA GAGGAGCAGA 840

GGTTCACCTG CTACATGGAA CACAGCGGGA ATCACGGCAC TCACCCTGTG CCCTCTGGGA 900

AGGTGCTGGT GCTTCAGAGT CAACGGACAG ACTTTCCATA TGTTTCTGCT GCTATGCCAT 960

GTTTTGTTAT TATTATTATT CTCTGTGTCC CTTGTTGCAA GAAGAAAACA TCTAGCGGCAG 1020

AGGGTCCAGA GCTTGTGAGC CTGCAGGTCC TGGATCAACA CCCAGTTGGG ACAGGAGACC 1080

ACAGGGATGC AGCACAGCTG GGATTTCAGC CTCTGATGTC AGCTACTGGG TCCACTGGTT 1140

CCACTGAGGG CGCCTAGACT CTACAGCCAG GCGGCCAGGA TTCAACTCCC TGCCTGGATC
TCACCAGCAC TTTCCCTCTG TTTCCTGACC TATGAAACAG AAAATAACAT CACTTATTTA 
TTGTTGTTGG ATGCTGCAAA GTGTTAGTAG GTATGAGGTG TTTGCTGCTC TGCCACGTAG 
AGAGCCAGCA AAGGGATCAT GACCAACTCA ACATTCCATT GGAGGCTATA TGATCAAACA 
GCAAATTGTT TATCATGAAT GCAGGATGTG GGCAAACTCA CGACTGCTCC TGCCAACAGA 
AGGTTTGCTG AGGGCATTCA CTCCATGGTG CTCATTGGAG TTATCTACTG GGTCATCTAG
AGCCTATTGT TTGAGGAATG CAGTCTTACA AGCCTACTCT GGACCCAGCA GCTGACTCCT 
TCTTCCACCC CTCTTCTTGC TATCTCCTAT ACCAATAAAT ACGAAGGGCT GTGGAAGATC 
AGAGCCCTTG TTCACGAGAA GCAAGAAGCC CCCTGACCCC TTGTTCCAAA TATACTCTTT 
TGTCTTTCTC TTTATTCCCA CGTTCGCCCT TTGTTCAGTC CAATACAGGG TTGTGGGGCC 
CTTAACAGTG CCATATTAAT TGGTATCATT ATTTCTGTTG TTTTTGTTTT TGTTTTTGTT 
TTTGTTTTTG AGACAGAGTC TCACTCGTCA CCCAGGCTGC AGTTCACTGG TGTGATCTCA 
GCTCACTGCA ACCTCTGCCT CCCAGGTTCA AGCACTTCTC GTACCTCAGA CTCCCGATAG 
CTGGGATTAC AGACAGGCAC CACCACACCC AGCTAATTTT TGTATTTTTT GTAGAGACGG 
GGTTTCGCCA AGTTGACCAG CCCAGTTTCA AACTCCTGAC CTCAGGTGAT CTGCCTGCCT 
TGGCATCCCA AAGTGCTGGG ATTACAAGAA TGAGCCACCG TGCCTGGCCT ATTTTATTAT 
ATTGTAATAT ATTTTATTAT ATTAGCCACC ATGCCTGTCC TATTTTCTTA TGTTTTAATA 
TATTTTAATA TATTACATGT GCAGTAATTA GATTATCATG GGTGAACTTT ATGAGTGAGT 
ATCTTGGTGA TGACTCCTCC TGACCAGCCC AGGACCAGCT TTCTTGTCAC CTTGAGGTCC 
CCTCGCCCCG TCACACCGTT ATCGATTACT CTGTGTCTAC TATTATGTGT GCATAATTTA 
TACCGTAAAT GTTTACTCTT TAAATAAAAA AAAAAAAAAA 

(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 383 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

Met Gly Leu Gly Arg Val Leu Leu Phe Leu Ala Val Ala Phe Pro Ph
1 5 10 15

Ala Pro Pro Ala Ala Ala Ala Glu Pro His Ser Leu Arg Tyr Asn Le

Met Val Leu Ser Gln Asp Glu Ser Val Gln Ser Gly Phe Leu Ala Glu
35 40 45

Gly His Leu Asp Gly Gln Pro Phe Leu Arg Tyr Asp Arg Gln Lys Arg
50 55 60

Arg Ala Lys Pro Gln Gly Gln Trp Ala Glu Asp Val Leu Gly Ala Lys
65 70 75 80

Thr Trp Asp Thr Glu Thr Glu Asp Leu Thr Glu Asn Gly Gln Asp Leu
85 90 95

Arg Arg Thr Leu Thr His Ile Lys Asp Gln Lys Gly Gly Leu His Ser
100 105 110

Leu Gln Glu Ile Arg Val Cys Glu Ile His Glu Asp Ser Ser Thr Arg
115 120 125

Gly Ser Arg His Phe Tyr Tyr Asp Gly Glu Leu Phe Leu Ser Gln Asn
130 135 140

Leu Glu Thr Gln Glu Ser Thr Val Pro Gln Ser Ser Arg Ala Gln Thr
145 150 155 160

Leu Ala Met Asn Val Thr Asn Phe Trp Lys Glu Asp Ala Met Lys Thr
165 170 175

Lys Thr His Tyr Arg Ala Met Gln Ala Asp Cys Leu Gln Lys Leu Gln
180 185 190

Arg Tyr Leu Lys Ser Gly Val Ala Ile Arg Arg Thr Val Pro Pro Met
195 200 205

Val Asn Val Thr Cys Ser Glu Val Ser Glu Gly Asn Ile Thr Val Thr
210 215 220

Cys Arg Ala Ser Ser Phe Tyr Pro Arg Asn Ile Thr Leu Thr Trp Arg
225 230 235 240

Gln Asp Gly Val Ser Leu Ser His Asn Thr Gln Gln Trp Gly Asp Val
245 250 255

Leu Pro Asp Gly Asn Gly Thr Tyr Gln Thr Trp Val Ala Thr Arg Ile
260 265 270

Arg Gln Gly Glu Glu Gln Arg Phe Thr Cys Tyr Met Glu His Ser Gly
275 280 285

Asn His Gly Thr His Pro Val Pro Ser Gly Lys Val Leu Val Leu Gln
290 295 300

Ser Gln Arg Thr Asp Phe Pro Tyr Val Ser Ala Ala Met Pro Cys Phe
305 310 315 320

Val Ile Ile Ile Ile Leu Cys Val Pro Cys Cys Lys Lys Lys Thr Ser
325 330 335

Ala Ala Glu Gly Pro Glu Leu Val Ser Leu Gln Val Leu Asp Gln His
340 345 350

Pro Val Gly Thr Gly Asp His Arg Asp Ala Ala Gln Leu Gly Phe Gln
355 360 365 

Pro Leu Met Ser Ala Thr Gly Ser Thr Gly Ser Thr Glu Gly Ala 
370 375 380 

(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:
ACTGGGGAAC AAGGTTTATA TGAGA 25

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:
TGTCACCCGT CTTCTACAGG ACCC 24

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 17 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 
GGGGCCATGG GGCTGGG 17 

(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:
ATCTGAGATG TCGGTCC



(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 
CGTTCTTGTC CCTTTGCCCG TGTGC 



(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 
AACCCTTCCC TTACCCCCGT CGTAG 



(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 45 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 
TATGTAAAAC GACGGCCAGT TTCACCTGTG ATTTCCTCTT CCCCA 



(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 45 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 
GGTCTTTTCA ATCCCCGTCT CTCGTCCAGT ATCGACAAAG GACAT 



(2) INFORMATION FOR SEQ ID NO: 13: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 40 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:
TATGTAAAAC GACGGCCAGT TTCGGGAATG GAGAAGTCAC 40



(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:
CGAGAGGAGA GGGAGGTTAA CCAGTATCGA CAAAGGACAT 40

(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 40 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:
TATGTAAAAC GACGGCCAGT GTTCCTCTCC CCTCCTTAGA 40

(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 41 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 



AAAAAGTCCC TTTCACGACC ACCAGTATCG ACAAAGGACA T 